Controlling the Rate of GWAS False Discoveries
Authors
Abstract
With the rise of both the number and the complexity of traits of interest, control of the false discovery rate (FDR) in genetic association studies has become an increasingly appealing and accepted target for multiple comparison adjustment. While a number of robust FDR-controlling strategies exist, the nature of this error rate is intimately tied to the precise way in which discoveries are counted, and the performance of FDR-controlling procedures is satisfactory only if there is a one-to-one correspondence between what scientists describe as unique discoveries and the number of rejected hypotheses. The presence of linkage disequilibrium between markers in genome-wide association studies (GWAS) often leads researchers to consider the signal associated with multiple neighboring SNPs as indicating the existence of a single genomic locus with possible influence on the phenotype. This a posteriori aggregation of rejected hypotheses results in inflation of the relevant FDR. We propose a novel approach to FDR control that is based on prescreening to identify the level of resolution of distinct hypotheses. We show how FDR-controlling strategies can be adapted to account for this initial selection, both with theoretical results and with simulations that mimic the dependence structure to be expected in GWAS. We demonstrate that our approach is versatile and useful when the data are analyzed using both tests based on single markers and multiple regression. We provide an R package that allows practitioners to apply our procedure to standard GWAS format data, and illustrate its performance on lipid traits in the North Finland Birth Cohort 66 study.
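The inflation described in the abstract can be illustrated with a small sketch. It is not the authors' prescreening procedure or their R package: it only applies the standard Benjamini-Hochberg step-up rule to made-up p-values, then computes the false discovery proportion twice, once per SNP and once per locus. The p-values, the locus labels, the null indicators, and the convention that a locus counts as a false discovery only when all of its rejected SNPs are null are assumptions chosen for illustration.

```python
def benjamini_hochberg(pvalues, q=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Keep the largest rank whose p-value falls under the BH line.
        if pvalues[i] <= q * rank / m:
            k = rank
    return sorted(order[:k])


def false_discovery_proportion(rejected, is_null, group=None):
    """FDP over discoveries; with `group`, rejections sharing a label count once."""
    units = {}
    for i in rejected:
        label = i if group is None else group[i]
        units.setdefault(label, []).append(i)
    if not units:
        return 0.0
    # Assumed convention: a grouped discovery is false only if all members are null.
    n_false = sum(all(is_null[i] for i in members) for members in units.values())
    return n_false / len(units)


# Hypothetical toy data: SNPs 0-3 tag one true locus "A"; SNP 4 is a null hit.
pvalues = [0.001, 0.002, 0.003, 0.004, 0.02, 0.5, 0.6, 0.7, 0.8, 0.9]
is_null = [False, False, False, False, True, True, True, True, True, True]
locus = ["A", "A", "A", "A", "B", "C", "D", "E", "F", "G"]

rejected = benjamini_hochberg(pvalues, q=0.05)
snp_fdp = false_discovery_proportion(rejected, is_null)
locus_fdp = false_discovery_proportion(rejected, is_null, group=locus)
print(rejected, snp_fdp, locus_fdp)
```

In this toy case five SNPs are rejected and one of them is null, so the per-SNP proportion is 1 in 5. Once the four correlated SNPs are reported as a single locus, the same rejections amount to two discoveries, one of which is false, so the proportion rises to 1 in 2.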
Similar resources
A regression framework for the proportion of true null hypotheses
The false discovery rate is one of the most commonly used error rates for measuring and controlling rates of false discoveries when performing multiple tests. Adaptive false discovery rates rely on an estimate of the proportion of null hypotheses among all the hypotheses being tested. This proportion is typically estimated once for each collection of hypotheses. Here we propose a regression fra...
Improving Power of Genome-Wide Association Studies with Weighted False Discovery Rate Control and Prioritized Subset Analysis
The issue of large-scale testing has caught much attention with the advent of high-throughput technologies. In genomic studies, researchers are often confronted with a large number of tests. To make simultaneous inference for the many tests, the false discovery rate (FDR) control provides a practical balance between the number of true positives and the number of false positives. However, when f...
False discovery rate for scanning statistics
The false discovery rate is a criterion for controlling Type I error in simultaneous testing of multiple hypotheses. For scanning statistics, due to local dependence, clusters of neighbouring hypotheses are likely to be rejected together. In such situations, it is more intuitive and informative to group neighbouring rejections together and count them as a single discovery, with the false discov...
A Stochastic Process Approach to False Discovery Rates
This paper extends the theory of false discovery rates (FDR) pioneered by Benjamini and Hochberg (1995). We develop a framework in which the False Discovery Proportion (FDP), the number of false rejections divided by the number of rejections, is treated as a stochastic process. After obtaining the limiting distribution of the process, we demonstrate the validity of a class of procedures for ...
A Stochastic Process Approach to False Discovery Control
This paper extends the theory of false discovery rates (FDR) pioneered by Benjamini and Hochberg (1995). We develop a framework in which the False Discovery Proportion (FDP), the number of false rejections divided by the number of rejections, is treated as a stochastic process. After obtaining the limiting distribution of the process, we demonstrate the validity of a class of procedures for controlling the False ...